Abstract. We provide an optimally mixing Markov chain for 6-colorings 
of the square lattice on rectangular regions with free, fixed, or toroidal 
boundary conditions. This implies that the uniform distribution on the 
set of such colorings has strong spatial mixing, so that the 6-state Potts 
antiferromagnet has a finite correlation length and a unique Gibbs mea- 
sure at zero temperature. Four and five are now the only remaining val- 
ues of q for which it is not known whether there exists a rapidly mixing 
Markov chain for q-colorings of the square lattice. 



1 Introduction 

Sampling and counting graph colorings is a fundamental problem in computer 
science and discrete mathematics. It is also of fundamental interest in statis- 
tical physics: graph colorings correspond to the zero-temperature case of the 
antiferromagnetic Potts model, a model of magnetism on which physicists have 
performed extensive numerical experiments (see for instance [11, 18, 17]). 

Physicists wish to estimate quantities such as spatial correlations and mag- 
netization, and to do this they sample random states using Markov chains. This 
is a general technique whereby one starts with an arbitrary state, and then re- 
peatedly modifies it using a random rule. For the zero-temperature Potts model, 
two standard such rules described below are Glauber dynamics and Kempe chain 
flips, also known as the Wang-Swendsen-Kotecky algorithm. While these algo- 
rithms often appear to work well in practice, we would like to know that their 
mixing time, i.e., the time it takes them to achieve a nearly uniform distribution 
on the set of states, is polynomially bounded as a function of the size of the 
lattice. Establishing this rigorously has been a major project in mathematical 
physics and theoretical computer science; see for example [3, 13, 15, 20, 24]. 

Moreover, optimal temporal mixing, i.e., a mixing time of O(n log n), is deeply 
related to the physical properties of the system [10]. In particular, under certain 
conditions on the Markov chain, it implies spatial mixing, i.e., the exponential 
decay of correlations, and thus the existence of a finite correlation length and the 
uniqueness of the Gibbs measure. Therefore, optimal mixing of natural Markov 
chains for q-colorings of the square lattice is considered a major open problem in 
physics. Physicists have conjectured [11, 23] that the q-state Potts 
model has spatial mixing for q ≥ 4, even at zero temperature. As we discuss 
below, previous results [13, 20, 3] have established this rigorously for q ≥ 7. 

Our main result is that the square lattice has strong spatial mixing for q = 6, 
and in particular that a natural Markov chain has optimal temporal mixing. We 
consider the so-called block heat-bath dynamics, which we call M(i, j). At each 
step, for fixed integers i and j, we choose an i × j block S of the lattice G, 
uniformly at random from among all such blocks contained in G. Let C be the 
set of q-colorings of S which are consistent with the coloring of G \ S. We choose 
a uniformly random coloring c ∈ C and recolor S with c. Our main theorem is: 

Theorem 1. M(2, 3) on 6-colorings of the square lattice mixes in O(n log n) 
time. 
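
The block heat-bath step M(i, j) defined above can be illustrated with a short sketch. This is a minimal brute-force version for toroidal boundary conditions only, not the programs used in the paper; the function names (`block_heat_bath_step`, `torus_neighbors`, `is_proper`) and the dictionary representation of a coloring are assumptions made for illustration. It enumerates all q^(ij) colorings of the block, keeps those consistent with the coloring outside the block, and picks one uniformly at random.

```python
import itertools
import random


def torus_neighbors(v, h, w):
    """The four neighbors of vertex v = (row, col) on an h x w torus."""
    r, c = v
    return [((r + 1) % h, c), ((r - 1) % h, c), (r, (c + 1) % w), (r, (c - 1) % w)]


def block_heat_bath_step(coloring, h, w, q, i=2, j=3, rng=random):
    """One step of M(i, j) on an h x w torus (assumes h > i and w > j)."""
    r0, c0 = rng.randrange(h), rng.randrange(w)   # uniformly random block position
    block = [((r0 + dr) % h, (c0 + dc) % w) for dr in range(i) for dc in range(j)]
    inside = set(block)
    candidates = []                               # the set C of consistent recolorings
    for colors in itertools.product(range(q), repeat=len(block)):
        assign = dict(zip(block, colors))
        # Compare with the new color for neighbors inside S, the old color outside S.
        ok = all(
            assign[v] != (assign[u] if u in inside else coloring[u])
            for v in block
            for u in torus_neighbors(v, h, w)
        )
        if ok:
            candidates.append(assign)
    new = dict(coloring)
    new.update(rng.choice(candidates))            # uniform choice of c in C
    return new


def is_proper(coloring, h, w):
    """True if no two adjacent vertices of the torus share a color."""
    return all(coloring[v] != coloring[u]
               for v in coloring for u in torus_neighbors(v, h, w))
```

Starting from any proper coloring, the candidate list is never empty, since the current coloring of the block is itself consistent. The brute-force enumeration is only practical for small blocks such as 2 × 3 with q = 6.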

We prove Theorem 1 for finite rectangular regions with free, fixed, or toroidal 
boundary conditions. Our method is similar to that of [3] in that it consists of a 
computer-assisted proof of the existence of a path coupling. At the same time, 
we exploit the specific geometry of the square lattice to consider a greater variety 
of neighborhoods. Moreover, the calculations necessary to find a good coupling 
in our setting are far more complicated than those in [3] and require several new 
ideas to become computationally tractable. 

Using the comparison method of Diaconis and Saloff-Coste [6, 19], Theorem 1 
implies that the Glauber and Kempe chain Markov chains also mix in polynomial 
time: 

Theorem 2. The Glauber and Kempe chain Markov chains on 6-colorings of 
the square lattice mix in O(n^2 log n) time. 

Like Theorem 1, this result holds on finite rectangular regions with free, fixed, 
or toroidal boundary conditions. 

To discuss spatial mixing, suppose we have a finite region V and two colorings 
C, C' of its boundary that differ at a single vertex v, and a subregion U ⊂ V such 
that the distance from v to the nearest point u ∈ U is ℓ. Let μ and μ' denote 
the probability distributions on colorings of U, given the uniform distribution 
on colorings of V conditioned on C and C' respectively. Then we define spatial 
mixing as follows: 

Definition 1 ([10]). We say that q-colorings have strong spatial mixing if there 
are constants α, β > 0 such that the total variation distance between μ and μ' 
obeys ‖μ − μ'‖ ≤ β|U| exp(−αℓ). 

In other words, strong spatial mixing means that conditioning a uniformly 
random coloring of the lattice on the event that particular colors appear on 
vertices far away from v has an exponentially small effect on the conditional 
distribution of the color of v. Physically, this means that correlations decay 
exponentially as a function of distance, and that the system has a unique Gibbs 
measure and no phase transition. 

The following recent result of Dyer, Sinclair, Vigoda and Weitz [10] (see 
also the lecture notes by Martinelli) relates optimal temporal mixing with 
spatial mixing: if the boundary constraints are permissive, i.e., if a finite region 
can always be colored no matter how we color its boundary, and if the heat-bath 
dynamics on some finite block mixes in O(n log n) time, then the system 
has strong spatial mixing. As they point out, q-colorings are permissive for any 
q ≥ Δ + 1, where Δ is the maximum degree. Therefore, Theorem 1, which states 
that M(2, 3) has optimal temporal mixing, implies the following result about 
spatial correlations. 

Corollary 1. The uniform measure on the set of q-colorings of the square lattice, 
or equivalently the zero-temperature antiferromagnetic q-state Potts model 
on the square lattice, has strong spatial mixing for q ≥ 6. 

As mentioned above, physicists conjecture spatial mixing for q ≥ 4. In the last 
section we discuss to what extent our techniques might be extended to q = 4, 5. 

1.1 Markov Chains, Mixing Times and Earlier Work 

Given a Markov chain M, let π be its stationary distribution and P_x^t be the 
probability distribution after t steps starting with an initial configuration x. 
Then, for a given ε > 0, the ε-mixing time of M is 

\[ \tau_\varepsilon = \max_x \min \{ t : \| P_x^t - \pi \| < \varepsilon \} , \]

where ‖P_x^t − π‖ denotes the total variation distance 

\[ \| P_x^t - \pi \| = \frac{1}{2} \sum_y \left| P_x^t(y) - \pi(y) \right| . \]

In this paper we will often adopt the common practice of suppressing the 
dependence on ε, which is typically logarithmic, and speak just of the mixing time 
τ for fixed small ε. Thus the mixing time becomes a function of n, the number 
of vertices, alone. We say that a Markov chain has rapid mixing if τ = poly(n), 
and optimal (temporal) mixing if τ = O(n log n). 
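
To make the definition of the ε-mixing time concrete, the following sketch computes it by direct iteration for a chain small enough to store as a transition matrix. The helper names (`total_variation`, `step`, `mixing_time`) and the `t_max` cutoff are illustrative assumptions; this approach is hopeless for colorings of a large lattice, where the state space is exponentially large.

```python
def total_variation(p, q):
    """Total variation distance between two distributions given as lists."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def step(dist, P):
    """Advance a distribution by one step of the chain with transition matrix P."""
    n = len(P)
    return [sum(dist[x] * P[x][y] for x in range(n)) for y in range(n)]


def mixing_time(P, pi, eps, t_max=10_000):
    """Smallest t such that, from every start state x, ||P_x^t - pi|| < eps."""
    n = len(P)
    worst = 0
    for x in range(n):
        dist = [1.0 if y == x else 0.0 for y in range(n)]
        t = 0
        while total_variation(dist, pi) >= eps:
            dist = step(dist, P)
            t += 1
            if t > t_max:
                raise RuntimeError("chain did not mix within t_max steps")
        worst = max(worst, t)       # maximum over initial configurations
    return worst
```

For example, for the lazy two-state chain with transition matrix [[0.75, 0.25], [0.25, 0.75]] the distance to the uniform distribution halves at every step, so with ε = 1/4 the function returns 2.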

The most common Markov chain for the Potts model is Glauber dynamics. 
There are several variants of this in the literature, but for colorings we fix the 
following definition, which applies at zero temperature. At each step, choose a 
random vertex v ∈ G. Let S be the set of colors, and let T be the set of colors 
taken by v's neighbors. Then choose a color c uniformly at random from S \ T, 
i.e., from among the colors consistent with the coloring of G − {v}, and recolor 
v with c. 
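
A minimal sketch of one step of the Glauber dynamics as just defined, on a rectangular grid with free boundary conditions. The helper names (`grid_neighbors`, `glauber_step`, `is_proper`) and the dictionary representation of a coloring are assumptions made for illustration.

```python
import random


def grid_neighbors(h, w):
    """Adjacency lists of the h x w square grid with free boundary."""
    nbrs = {}
    for r in range(h):
        for c in range(w):
            cand = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
            nbrs[(r, c)] = [(a, b) for a, b in cand if 0 <= a < h and 0 <= b < w]
    return nbrs


def glauber_step(coloring, nbrs, q, rng=random):
    """Recolor a uniformly random vertex with a uniformly random allowed color."""
    v = rng.choice(sorted(coloring))
    taken = {coloring[u] for u in nbrs[v]}              # the set T
    allowed = [c for c in range(q) if c not in taken]   # the set S \ T
    new = dict(coloring)
    new[v] = rng.choice(allowed)
    return new


def is_proper(coloring, nbrs):
    """True if no two adjacent vertices share a color."""
    return all(coloring[v] != coloring[u] for v in nbrs for u in nbrs[v])
```

When the current coloring is proper, the list of allowed colors always contains the current color of v, so the step is well defined for any q.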

Independently, Jerrum [13] and Salas and Sokal [20] proved that for q-colorings 
on a graph of maximum degree Δ the Glauber dynamics (i) is ergodic for 
q ≥ Δ + 2 (this holds for fixed boundary conditions as well) and (ii) has optimal 
mixing for q > 2Δ. For q = 2Δ, Bubley and Dyer [2] showed that it mixes in 
O(n^3) time and Molloy showed that it has optimal mixing. Since the square 
lattice has Δ = 4, these results imply optimal mixing for q ≥ 8. 

Dyer and Greenhill [8] considered a "heat bath" Markov chain which updates 
both endpoints of a random edge simultaneously, and showed that it has optimal 
mixing for q ≥ 2Δ. By widening the updated region to include a vertex and all of 
its neighbors, Bubley, Dyer and Greenhill [3] showed optimal mixing for q ≥ 7 for 
triangle-free graphs with maximum degree 4, which includes the square lattice. 

Another commonly used Markov chain is the Kempe chain algorithm, known 
in physics as the zero-temperature case of the Wang-Swendsen-Kotecky algorithm 
[25, 26]. It works as follows: we choose a random vertex v and a color b 
which differs from v's current color a. We construct the largest connected subgraph 
containing v which is colored with a and b, and recolor this subgraph by 
switching a and b. In a major breakthrough, Vigoda [23] showed that a similar 
Markov chain has optimal mixing for q > (11/6)Δ, and that this implied that 
the Glauber dynamics and the Kempe chain algorithm both have rapid mixing 
for q > (11/6)Δ. However, for the square lattice this again gives only q ≥ 8. 
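
The Kempe chain move described above can be sketched as follows, again on a free-boundary grid. The names `grid_neighbors`, `kempe_step` and `is_proper` are illustrative assumptions; the chain is found by breadth-first search over vertices colored a or b.

```python
import random
from collections import deque


def grid_neighbors(h, w):
    """Adjacency lists of the h x w square grid with free boundary."""
    nbrs = {}
    for r in range(h):
        for c in range(w):
            cand = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
            nbrs[(r, c)] = [(a, b) for a, b in cand if 0 <= a < h and 0 <= b < w]
    return nbrs


def kempe_step(coloring, nbrs, q, rng=random):
    """Flip the Kempe chain through a random vertex v for a random color b."""
    v = rng.choice(sorted(coloring))
    a = coloring[v]
    b = rng.choice([c for c in range(q) if c != a])
    # Largest connected subgraph containing v whose vertices are colored a or b.
    chain = {v}
    queue = deque([v])
    while queue:
        x = queue.popleft()
        for u in nbrs[x]:
            if u not in chain and coloring[u] in (a, b):
                chain.add(u)
                queue.append(u)
    new = dict(coloring)
    for x in chain:
        new[x] = b if coloring[x] == a else a      # switch a and b on the chain
    return new


def is_proper(coloring, nbrs):
    """True if no two adjacent vertices share a color."""
    return all(coloring[v] != coloring[u] for v in nbrs for u in nbrs[v])
```

The flip preserves properness because every vertex adjacent to the chain but outside it is colored with neither a nor b.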

For q = 3 on the square lattice, Luby, Randall and Sinclair [14] showed that 
a Markov chain including "tower moves" has rapid mixing for any finite 
simply-connected region with fixed boundary conditions, and Randall and Tetali [19] 
showed that this implies rapid mixing for the Glauber dynamics as well. Recently 
Goldberg, Martin and Paterson [12] proved rapid mixing for the Glauber 
dynamics on rectangular regions with free boundary conditions, i.e., with no 
fixed coloring of the vertices on their boundary. Unfortunately, the technique 
of [14, 12] relies on a bijection between 3-colorings and random surfaces through 
a "height representation" which does not hold for other values of q. 

2 Coupling 

We consider two parallel runs of our Markov chain, M(2, 3), with initial colorings 
X_0, Y_0. We will couple the steps of these chains in such a way that (i) each 
chain runs according to the correct distribution on its choices and (ii) with high 
probability, X_t = Y_t for some t = O(n log n). A now standard fact in this area 
is that this implies that the chain mixes in time O(n log n), i.e., this implies 
Theorem 1. This fact was first proved by Aldous (see also [8]). 

Bubley and Dyer [2] introduced the very useful technique of Path Coupling, 
via which it suffices to do the following: Consider any two colorings X, Y which 
have a Hamming distance one, i.e., which differ on exactly one vertex, and carry 
out a single step of the chain on X and on Y, producing two new colorings 
X', Y'. We will prove that we can couple these two steps such that (i) each step 
is selected according to the correct distribution, and (ii) the expected Hamming 
distance between X' and Y' is at most 1 − ε/n for some constant ε > 0. Thus the 
expected change in the Hamming distance between the two colorings is negative, 
at most −ε/n. See, e.g., [8] for the formal (by now standard) details as to why 
this suffices to prove Theorem 1. 



We perform the required coupling as follows. Let X and Y be two arbitrary 
6-colorings which only disagree at one vertex. We pick a uniformly random 2 × 3 
block S, and let C_X and C_Y denote the sets of permissible recolorings of S 
according to X, Y respectively. For each c ∈ C_X, we define a carefully chosen 
probability distribution p_c on the colorings of C_Y. We pick a uniformly random 
coloring c_1 ∈ C_X and in X we recolor S with c_1 to produce X'. We then pick a 
random coloring c_2 ∈ C_Y according to the distribution p_{c_1} and in Y we recolor 
S with c_2 to produce Y'. Trivially, the marginal distribution of (S, c_1) is uniform 
on S and C_X. In order to ensure that the same is true of (S, c_2), we must have 
the following property for the set of distributions {p_c : c ∈ C_X}: 

\[ \frac{1}{|C_X|} \sum_{c \in C_X} p_c(c_2) = \frac{1}{|C_Y|} \quad \text{for each } c_2 \in C_Y . \tag{1} \]

Suppose that v is the vertex on which X, Y differ. If v ∈ S then C_X = C_Y, 
so we can simply define c_2 = c_1 (i.e., p_c(c) = 1 for each c) and this ensures that 
X' = Y'. If S does not contain v or any neighbor of v, then again C_X = C_Y and 
by defining c_2 = c_1 we ensure that X', Y' differ only on v. If v is not in S but is 
adjacent to a vertex in S, then C_X ≠ C_Y and so, depending on our coupling, it 
is quite possible that c_2 ≠ c_1 and so X', Y' will differ on one or more vertices of 
S as well as on v. 

For any pair C_X, C_Y, we let H(C_X, C_Y) denote the expected number of 
vertices in S on which c_1, c_2 differ. For every possible pair C_X, C_Y we obtain a 
coupling satisfying (1) and: 

\[ H(C_X, C_Y) < 0.52 . \tag{2} \]

Thus the expected change in the Hamming distance between the two colorings 
is −1 if v ∈ S and less than 0.52 if v is adjacent to S. 

Proof of Theorem 1. As described above, it suffices to prove that for any 
choice of X, Y differing only at v, the expected Hamming distance between X' 
and Y' is less than 1 − ε/n for some ε > 0, or equivalently that the expected 
change in the Hamming distance is less than −ε/n. 

Let us consider toroidal boundary conditions first; there are n possible blocks 
from which the algorithm will choose, one for each vertex. Each vertex is 
contained in 6 blocks and is adjacent to 10 blocks, so v ∈ S, or v is adjacent to 
S, with probability 6/n or 10/n respectively. Thus the expected change in the 
Hamming distance is less than 

\[ -\frac{6}{n} + 0.52 \times \frac{10}{n} = -\frac{0.8}{n} . \]

For rectangular regions with free boundary conditions the situation is a little 
more complicated. A natural extension of the basic procedure is to choose any 
block that lies entirely in the region. This, however, means that a region of height 
h and width w with n vertices will only have n − 2h − w + 2 possible 2 × 3 blocks; 
furthermore, when the distinguished vertex v is on or close to the boundary, the 
number of blocks containing v is less than the 6 we have when v is snugly in the 
interior, thereby lowering our chances of decreasing the Hamming distance. 

It turns out that if we are dealing with free boundary conditions the natural 
extension will work anyway, since then the maximum expected change in the 
Hamming distance when v is on or close to the boundary and adjacent to our 
block is low enough to counteract the reduced chance that our choice of block 
will contain v. In particular, we obtained these values for the expected change 
in the Hamming distance when v is adjacent to the block: 

    v is on boundary, block is not:      < 0.52 (i.e., the interior case) 
    block is on boundary, v is not:      < 0.503 
    both v and block are on boundary:    < 0.390 

As an example, when v is in the corner of the lattice we have one choice of a 
block containing v, and two choices of a block adjacent to v; both of the adjacent 
blocks are on the boundary too, so the expected change in the Hamming 
distance is less than 

\[ -1 \times \frac{1}{n - 2h - w + 2} + 0.390 \times \frac{2}{n - 2h - w + 2} = -\frac{0.22}{n - 2h - w + 2} . \]

It can be easily verified that in the other cases where v is on or close to a free 
boundary the values given above are also well within what we need for rapid 
mixing. 

If we have to deal with fixed, arbitrary boundary conditions — that is, a fixed 
proper coloring of the sites just outside the region — applying our Markov process 
directly will not give us the same result. The problem is that the ratio between 
the number of blocks containing v and adjacent to v is the same as in the free 
boundary case, but the expected change in Hamming distance for any block 
adjacent to v is the same as in the toroidal case, as opposed to the lower values 
we obtain for free boundary conditions above. 

An alternative modification of our basic Markov process that will work is to 
choose any block that contains any vertex in the region, and then (properly) 
recolor all the vertices in that block which are also in the region. For a lattice of 
height h and width w with n vertices this gives us n + 2h + w + 2 possible 2x3 
blocks to choose from; more importantly, for every vertex in the lattice there are 
6 available blocks which contain it. 

The expected change in the Hamming distance when the block is on the 
boundary but fully contained in the lattice is now no different than the expected 
change for a block on the interior, but we do have to deal with cases where the 
block may be partially outside the lattice. When this happens we can merely 
treat the sub-block that we do recolor in exactly the same manner as we would 
treat a smaller block being recolored on the interior (that is, we reduce the size 
of the block to contain only the recolorable part). The calculations for smaller 
blocks adjacent to v were of course already done in order to confirm previous 
results or rule out the possibility of rapid mixing for block sizes below 2x3, and 
the results were: 



    Sub-block size      Max expected change 
    1 × 1               < 0.50 
    2 × 1 or 1 × 2      < 0.524 
    1 × 3               < 0.514 
    2 × 2               < 0.508 
    2 × 3               < 0.52 (i.e., the interior case) 



As an example, when v is in the corner of the lattice we now have 6 choices of a 
block containing v and five choices of a block adjacent to v; 2 of these choices 
are full 2 × 3 blocks, and the remaining choices have sub-block sizes of 2 × 1, 2 × 2, 
and 1 × 3 respectively. Hence the expected change in the Hamming distance is 
less than 

\[ -1 \times \frac{6}{n + 2h + w + 2} + \frac{0.52 \times 2 + 0.524 + 0.508 + 0.514}{n + 2h + w + 2} = -\frac{3.414}{n + 2h + w + 2} . \]

We could work out the cost explicitly for the rest of the cases where the distin- 
guished vertex v is on or near the boundary, but it is easier just to note that 
in no case can the expected cost of coloring a block adjacent to v be more than 
0.524, and there can never be more than 10 adjacent blocks to choose from, so 
the expected change in the Hamming distance can never be more than 

\[ -\frac{0.76}{n + 2h + w + 2} . \]

□ 
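
The block counts and the arithmetic used in the proof can be checked mechanically. The sketch below is an illustration rather than part of the proof: the helper `count_blocks`, the example dimensions h = 5, w = 7, and the use of exact fractions are all assumptions. It counts, for a given vertex v, the 2 × 3 blocks that contain v and those that are adjacent to v, under the three block-selection rules discussed above, and then recomputes the numerators of the four bounds.

```python
from fractions import Fraction as F

OFFSETS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def count_blocks(v, corners, region, wrap=None, a=2, b=3):
    """Return (#blocks containing v, #blocks adjacent to v but not containing it)."""
    def norm(p):
        return (p[0] % wrap[0], p[1] % wrap[1]) if wrap else p

    nbrs = {norm((v[0] + dr, v[1] + dc)) for dr, dc in OFFSETS} & region
    contain = adjacent = 0
    for r0, c0 in corners:
        # Only the part of the block that lies inside the region is recolored.
        block = {norm((r0 + dr, c0 + dc)) for dr in range(a) for dc in range(b)} & region
        if v in block:
            contain += 1
        elif block & nbrs:
            adjacent += 1
    return contain, adjacent


h, w = 5, 7                       # example dimensions (an assumption)
n = h * w
region = {(r, c) for r in range(h) for c in range(w)}

# Top-left corners of the allowed blocks under each rule.
torus = [(r, c) for r in range(h) for c in range(w)]
free = [(r, c) for r in range(h - 1) for c in range(w - 2)]
fixed = [(r, c) for r in range(-1, h) for c in range(-2, w)]

# Numerators of the bounds on the expected change in Hamming distance.
toroidal_bound = -6 + F("0.52") * 10
free_corner_bound = -1 + F("0.390") * 2
fixed_corner_bound = -6 + 2 * F("0.52") + F("0.524") + F("0.508") + F("0.514")
fixed_worst_bound = -6 + 10 * F("0.524")
```

For these dimensions the three rules give n, n − 2h − w + 2 and n + 2h + w + 2 blocks respectively, and for the corner vertex (0, 0) the counts of containing and adjacent blocks agree with the figures quoted in the proof.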

Of course, we still need to prove that the desired couplings exist for each 
possible C_X, C_Y. These couplings were found with the aid of computer programs. 
In principle, for any pair C_X, C_Y, searching for the coupling that minimizes 
H(C_X, C_Y) subject to (1) is simply a matter of solving a linear program and so 
can be done in polynomial time. However, the number of variables is |C_X||C_Y| 
which, a priori, can be roughly (5^6)^2. Furthermore, the number of possible pairs 
X, Y is roughly 6^10, and even after eliminating pairs which are redundant by 
symmetry, it is enormous. To deal with this combinatorial explosion we designed 
a fast heuristic which, rather than finding the best coupling for a particular pair, 
simply finds a very good coupling, i.e., one that satisfies (2). The code used can 
be found at www.cs.toronto.edu/~fvb. We provide a more detailed description 
in the next section. 

3 The Programs Used 

Method of the computation: Let R denote the rim vertices, that is, those 
vertices which are adjacent to but outside of the block S. We call a coloring of the 
vertices of R a rim coloring. For each possible pair of rim colorings X, Y which 
differ only at a vertex v ∈ R, we need to find a coupling between the extensions 
C_X and C_Y of X, Y to S, so that the couplings satisfy (1) and (2). These 
couplings were found with a small suite of programs working in two phases. In 
the first phase, exhaustive lists of pairs of rim colorings (reduced by equivalence 
with respect to allowable block colorings) were generated. In the second phase, 
for each pair X, Y, we generated C_X, C_Y separately; these were then coupled, 
satisfying (1), in a nearly optimal way to obtain a bound on H(C_X, C_Y) that 
satisfies (2). 

Implementation: All programs take the following parameters: number of 
colors, block dimensions, and an integer denoting the position of the distin- 
guished vertex v with respect to the block (0 if adjacent to the corner, +i if 
adjacent to the ith vertex along the top of the block, and −i if adjacent to 
the ith vertex along the side of the block). We assume that v has color 0 in X 
and 1 in Y. Thus, if one specifies a rim coloring X and the position of v, then 
this determines Y. For each coloring X we determined a good coupling for each 
non-equivalent position of v. 

By default the programs generate rim colorings and couplings on the as- 
sumption that the block is not on the boundary of the lattice (i.e., all rim ver- 
tices potentially constrain the allowable block colorings). Free boundary cases, 
however, can easily be simulated by using values in the rim colorings that are 
outside the range determined by the number-of-colors parameter. These were 
only checked when the analysis of the non-boundary blocks yielded promising 
values. As mentioned above, the fixed arbitrary boundary cases required no new 
calculations, since from the block's "point of view" there is no difference (in the 
worst case) between being on an arbitrarily colored boundary and being in the 
interior. Thus, we simply reused previously calculated values for smaller blocks 
on the interior to verify that the modified Markov process would work. 

Generating rim colorings: Since the calculations required for phase 2 were 
much more time-consuming than those for phase 1, the rim coloring generation 
procedure was designed to minimize the number of colorings output rather than 
the time used generating them. A rim coloring is represented by a vector of colors 
used on the rim, starting from the distinguished vertex v and going clockwise 
around the block. If the set of colors is {0, ... , 5}, we can assume by symmetry 
that v's color is 0 in X and 1 in Y. The following reductions were applied to avoid 
equivalent rim colorings: reduction by color isomorphism (colors 2 and above), 
by exchange of colors 0 and 1, by exchange of colors of vertices adjacent to the 
corners of the block, and by application of flip symmetries where applicable. 
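
As an illustration of the first of these reductions only, the following sketch maps a rim coloring to a canonical representative under renaming of the colors 2 and above. The function name `canonical_rim` and the tuple representation are assumptions; the other reductions (exchange of colors 0 and 1, corner exchanges, flip symmetries) are not implemented here.

```python
def canonical_rim(rim):
    """Relabel colors >= 2 in order of first appearance; colors 0 and 1 are kept."""
    relabel = {}
    out = []
    for c in rim:
        if c >= 2:
            if c not in relabel:
                relabel[c] = 2 + len(relabel)   # next unused canonical color
            c = relabel[c]
        out.append(c)
    return tuple(out)
```

Two rim colorings that differ only by a permutation of the colors 2 and above have the same canonical form, so keeping one coloring per canonical form removes that redundancy.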

Finding a coupling for particular rim colorings: Two programs were 
used for each rim coloring X, and position i of v. In each, the initial operation 
is the generation of all compatible lattice colorings; this is done separately for 
col(v) = 0 and col(v) = 1 (i.e., for C_X and C_Y). The first program creates a set 
of linear programming constraints that is readable by the program lp_solve (by 
Michel Berkelaar of the Eindhoven University of Technology; it is available with 
some Linux distributions). As mentioned above, time and space requirements 
made use of this procedure feasible only for checking individual rim colorings, 
and even then the block size had to be fairly modest. 



The second program calculates an upper bound on the optimal cost using a 
greedy algorithm to create a candidate coupling. Given sets of colorings C_X and 
C_Y of size m_X and m_Y respectively, the algorithm starts by assigning "unused" 
probabilities of 1/m_X and 1/m_Y respectively to the individual colorings. Then, 
for each distance d = 0, 1, ..., 6, for each coloring c_1 in C_X it traverses C_Y looking 
for a coloring c_2 which differs from c_1 on exactly d vertices. When such a c_2 is 
found it removes the coloring with the lower unused probability p from its list 
and reduces the unused probability p' of the other to p' − p; the distance d · p is 
added to the total distance so far. The order in which the lists of colorings C_X 
and C_Y are traversed does affect the solution, so an optional argument is available 
that allows the user to select one of several alternatives. 
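
The greedy procedure just described can be sketched as follows. This is a reconstruction from the description above, not the authors' code: the function names, the representation of colorings as tuples, and the use of exact fractions are assumptions, and only the default traversal order (the order of the input lists) is implemented.

```python
from fractions import Fraction


def hamming(c1, c2):
    """Number of block vertices on which two colorings differ."""
    return sum(x != y for x, y in zip(c1, c2))


def greedy_coupling_cost(CX, CY):
    """Upper bound on H(C_X, C_Y) from a greedily built coupling.

    CX and CY are lists of distinct colorings (tuples of equal length).
    """
    k = len(CX[0])
    left = {c: Fraction(1, len(CX)) for c in CX}    # unused probability in C_X
    right = {c: Fraction(1, len(CY)) for c in CY}   # unused probability in C_Y
    cost = Fraction(0)
    for d in range(k + 1):                          # closest pairs are matched first
        for c1 in CX:
            for c2 in CY:
                if left[c1] == 0:
                    break                           # c1 is used up
                if right[c2] > 0 and hamming(c1, c2) == d:
                    p = min(left[c1], right[c2])
                    left[c1] -= p
                    right[c2] -= p
                    cost += d * p
    return cost
```

Because every pair of colorings is examined at its own distance and each match exhausts at least one of the two colorings, all probability mass is matched by the end, so the result is the cost of a coupling satisfying (1). As a small usage example, consider a 1 × 1 block whose other three neighbors have colors 2, 3, 4 with q = 6: then C_X = {1, 5} and C_Y = {0, 5}, and `greedy_coupling_cost([(1,), (5,)], [(0,), (5,)])` returns 1/2.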

This heuristic does not guarantee an optimal solution, and with some blocks 
and particular rim colorings the coupling it generates is far from the best. However, 
for the rim colorings we are most interested in (ones where H(C_X, C_Y) is 
high for all couplings) it seems to consistently give results that are optimal or 
very close (within 2%). We cannot give a rigorous bound on the running time, 
but a cursory analysis and empirical evidence suggest that it runs in roughly 
O(m log m) time, where m is the number of compatible block colorings. Because 
the heuristic is so much faster than the LP solver, our general procedure was as 
follows: (1) Use the heuristic with the default traversal order to calculate bounds 
on the expected distance for all the rim colorings generated in phase one. (2) 
When feasible, use the LP solver on those rim colorings that had the highest 
value of H(C_X, C_Y), to obtain an exact value for their maximum. (3) For larger 
blocks or more colors than could be comfortably handled by the LP solver, use 
all available traversal orders on those rim colorings that had the maximum value 
of H(C_X, C_Y) to obtain as tight a bound as possible within a feasible time. 

Results of the computations: Computations were run on various block 
sizes and numbers of colors in order to check the correctness of the programs, 
and also to collect data which could be used to estimate running times and 
maximum expected distance for larger block dimensions. For 7- and 8-colorings 
our results corresponded well with previous work on the problem (e.g. [2]). 

For 6-colorings, we checked 1 × k blocks for k < 5, as well as 2 × 2 and 
2 × 3 blocks. For all but the last of these the maximum expected distance we 
obtained was too large to give us rapid mixing. The 2 × 3 subgrid has 2 
non-equivalent positions with respect to the rim: the corner (to which 8 rim vertices 
are adjacent) and the middle of the top or bottom side (to which 2 rim vertices 
are adjacent). Denote these positions 1 and 2 respectively. For each X, Y with v 
in position 1, we obtained a coupling satisfying: 

\[ H(C_X, C_Y) < 0.5118309760 . \]

For each X, Y with v in position 2, we obtained a coupling satisfying: 

\[ H(C_X, C_Y) < 0.4837863092 . \]

Thus, in each case we satisfy (2) as required. 



A slightly stronger output: By examining the problem a bit more closely, 
we see that condition (2) is sufficient, but not necessary, for our purposes. Let 
H_i denote the maximum of H(C_X, C_Y) over the couplings found for all pairs 
X, Y where v is in position i, and let mult_i denote the number of rim vertices 
adjacent to position i. Then, being more careful about the calculation used in 
the proof of Theorem 1 and extending it to a general a × b block, we see that 
the overall expected change in the Hamming distance is at most (in, say, the 
toroidal case where the number of possible blocks is n) 

\[ -1 \times \frac{ab}{n} + \sum_i \mathrm{mult}_i \times \frac{H_i}{n} , \]

a smaller value than that used in the proof of Theorem 1, where we (implicitly) 
used (max_i H_i) × Σ_i mult_i rather than Σ_i mult_i × H_i. Our programs actually 
compute this smaller value. Even so, we could not obtain suitable couplings for 
any block size smaller than 2 × 3. 
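
As a worked check of this refinement for the 2 × 3 block, the sketch below plugs in the two bounds reported above (H_1 for the corner position with mult_1 = 8, H_2 for the middle position with mult_2 = 2) and compares the refined numerator with the cruder one based on the maximum. The function names are illustrative assumptions; the results are the numerators of the bounds, to be divided by n.

```python
from fractions import Fraction as F


def refined_numerator(a, b, H, mult):
    """-ab + sum_i mult_i * H_i, i.e. n times the refined bound."""
    return -a * b + sum(mult[i] * H[i] for i in H)


def crude_numerator(a, b, H, mult):
    """-ab + (max_i H_i) * sum_i mult_i, i.e. n times the cruder bound."""
    return -a * b + max(H.values()) * sum(mult.values())


H = {1: F("0.5118309760"), 2: F("0.4837863092")}   # bounds reported in the text
mult = {1: 8, 2: 2}                                # rim vertices per position

refined = refined_numerator(2, 3, H, mult)   # -0.9377795736
crude = crude_numerator(2, 3, H, mult)       # -0.881690240
```

Both values are below the −0.8 obtained in the proof of Theorem 1 from the rounded bound 0.52, and the refined value is the smallest of the three.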



4 Rapid Mixing: Glauber and Kempe Chain Dynamics 

In this section we prove Theorem 2, showing rapid mixing for the Glauber and 
Kempe chain dynamics, by following the techniques and presentation of Randall 
and Tetali [19]. 

Suppose P and P̃ are two Markov chains on the same state space with the 
same stationary distribution π, and that we already have a bound on the mixing 
time of P̃ while we would like to obtain a bound for that of P. Let E(P) and 
E(P̃) denote the edges of these Markov chains, i.e., the pairs (x, y) such that 
the transition probabilities P(x, y) and P̃(x, y), respectively, are positive. Now, 
for each edge of P̃, i.e., each (x, y) ∈ E(P̃), choose a fixed path γ_{x,y} using the 
edges of P: that is, choose a series of states x = x_0, x_1, x_2, ..., x_k = y such 
that (x_i, x_{i+1}) ∈ E(P) for 0 ≤ i < k. Denote the length of such a path |γ_{x,y}|. 
Furthermore, for each (z, w) ∈ E(P), let Γ(z, w) ⊆ E(P̃) denote the set of pairs 
(x, y) such that γ_{x,y} uses the edge (z, w). Finally, let 

\[ A = \max_{(z,w) \in E(P)} \left\{ \frac{1}{\pi(z) P(z,w)} \sum_{(x,y) \in \Gamma(z,w)} |\gamma_{x,y}| \, \pi(x) \tilde{P}(x,y) \right\} . \]

Note that A depends on our choice of paths. 

By combining bounds on the mixing time in terms of the spectral gap [7, 22, 21] 
with an upper bound on P̃'s spectral gap in terms of P's due to Diaconis and 
Saloff-Coste [6], we obtain the following upper bound on P's mixing time: 

Theorem 3. Let P and P̃ be reversible Markov chains on q-colorings of a graph 
of n vertices whose unique stationary distribution is the uniform distribution. Let 
λ_1(P̃) be the largest eigenvalue of P̃'s transition matrix smaller than 1, let 
τ_P and τ_P̃ denote the ε-mixing time of P and P̃ respectively, and define A as above. 
Then for any ε < 1/4, 

\[ \tau_P \le \frac{4 \log q}{\lambda_1(\tilde{P})} \, A \, n \, \tau_{\tilde{P}} . \]

We omit the proof. The reason for the additional factor of n is the fact that 
the upper and lower bounds on mixing time in terms of the spectral gap are 
log 1/π_* apart, where π_* is the minimum of π(x) taken over all states x. Since 
π in this case is the uniform distribution and there are at most q^n colorings, we 
have log 1/π(x) ≤ n log q. On the square lattice, it is easy to see that there are 
an exponentially large number of q-colorings for q ≥ 3, so removing this factor 
of n would require a different comparison technique. 

Now suppose that P̃ is the block dynamics and P is the Glauber or Kempe 
chain dynamics. We wish to prove Theorem 2 by showing that τ_P = O(n τ_P̃). By 
adding self-loops with probability greater than 1/2 to the block dynamics, we 
can ensure that the eigenvalues of P̃ are positive with only a constant increase 
in the mixing time. Therefore, it suffices to find a choice of paths for which A 
is constant. Since, for all three of these Markov chains, each move occurs with 
probability Θ(1/n), if |γ_{x,y}| and |Γ(z, w)| are constant then so is A. 

In fact, for q ≥ Δ + 2, we can carry out a block move on any finite neighborhood 
with Glauber moves. We need to flip each vertex u in the block to its new 
color; however, its flip is blocked by a neighbor v if v's current color equals u's 
new color. Therefore, we first prepare for u's flip by changing v to a color which 
differs from u's new color as well as that of v's Δ neighbors. If the neighborhood 
has m vertices, this gives |γ_{x,y}| ≤ m(Δ + 1), or |γ_{x,y}| ≤ 30 for M(2, 3). (With a 
little work we can reduce this to 13.) 

For the Kempe chain dynamics, recall that each move of the chain chooses 
a vertex v and a color b other than v's current color. If b is the color which 
the Glauber dynamics would assign to v, then none of v's neighbors are colored 
with b, and the Kempe chain move is identical to the Glauber move. Since this 
happens with probability 1/q, the above argument applies to Kempe chain moves 
as well, and again we have |γ_{x,y}| ≤ m(Δ + 1). Moreover, we only need to consider 
moves that use Kempe chains of size 1. 

Finally, since each vertex appears in only m = 6 blocks, the number of block 
moves that use a given Glauber move or a given Kempe chain move of size 1 is 
bounded above by m times the number of pairs of colorings of the block. Thus 
|Γ(z, w)| ≤ m(q^m)^2, and we are done. 

An interesting open question is whether we can prove optimal temporal mixing 
for the Glauber or Kempe chain dynamics. One possibility is to use 
log-Sobolev inequalities. We leave this as a direction for further work. 



5 Conclusion: Larger Blocks and Smaller q? 

We have run our programs on 2 × 4 and 3 × 3 blocks to see if we could achieve 
rapid mixing on 5 colors, but in both cases the largest values of H(C_X, C_Y) 
were too high. It may be that rapid mixing on 5 colors is possible by recoloring 
a 3 × 4 block, based on the decrease of the ratio of max H(C_X, C_Y) · |R| to |S| as 
the dimensions increase; similar reasoning leads us to believe that rapid mixing 
using a 2 × k block is possible, but we would probably need a 2 × 10 block or 
larger to achieve success. Unfortunately, doing the calculations for 3 × 4 blocks 
is a daunting proposition. The problem is exponential in two directions at once 
(number of rim colorings, and number of block colorings for each rim coloring), 
so this would require a huge increase in the running time. 